Rows: 1,454
Columns: 5
Rowwise:
$ geoid <chr> "53001950200", "53005010600", "53005010902", "53007960…
$ hh_vmt <dbl> 49.22, 39.50, 36.03, 52.59, 39.31, 31.31, 40.17, 57.66…
$ vote_rep_pct <dbl> 77.975856, 52.493079, 57.784743, 62.749887, 53.021292,…
$ vote_i0732n_pct <dbl> 78.69265, 65.03399, 71.65084, 72.74983, 65.13168, 60.5…
$ geometry <GEOMETRY [m]> POLYGON ((-3507230 8109216,..., POLYGON ((-35…
1 Introduction
2 Data & Methods
2.1 Data Sources
- Voting Precinct Shapefiles: https://www.sos.wa.gov/elections/data-research/election-data-and-maps/reports-data-and-statistics/precinct-shapefiles
- Election Results: https://www.sos.wa.gov/elections/data-research/election-data-and-maps/election-results-and-voters-pamphlets
- American Community Survey: https://www.census.gov/programs-surveys/acs/data.html
- 2017 Local Area Transportation Characteristics for Households (LATCH): https://www.bts.gov/latch/latch-data
2.2 Data
| | |
|---|---|
| Name | model_data_skim |
| Number of rows | 1431 |
| Number of columns | 4 |
| Column type frequency: | |
| character | 1 |
| numeric | 3 |
| Group variables | None |
Variable type: character
| skim_variable | n_missing | complete_rate | min | max | empty | n_unique | whitespace |
|---|---|---|---|---|---|---|---|
| geoid | 0 | 1 | 11 | 11 | 0 | 1431 | 0 |
Variable type: numeric
| skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
|---|---|---|---|---|---|---|---|---|---|---|
| hh_vmt | 0 | 1 | 41.36 | 8.28 | 13.48 | 35.78 | 41.36 | 47.48 | 61.76 | ▁▂▇▇▂ |
| vote_rep_pct | 0 | 1 | 39.22 | 16.51 | 2.15 | 27.66 | 40.49 | 50.94 | 81.60 | ▃▆▇▅▁ |
| vote_i0732n_pct | 0 | 1 | 59.71 | 11.28 | 23.45 | 53.13 | 61.15 | 67.67 | 87.60 | ▁▂▇▇▂ |
2.3 Models
2.3.1 Ordinary Least Squares (OLS)
OLS linear regression model:
Parameter | Coefficient | SE | 95% CI | t(1428) | p
-------------------------------------------------------------------------
(Intercept) | 28.86 | 0.42 | [28.05, 29.68] | 69.36 | < .001
hh vmt | 0.16 | 0.01 | [ 0.14, 0.18] | 14.52 | < .001
vote rep pct | 0.61 | 5.62e-03 | [ 0.60, 0.63] | 109.46 | < .001
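The table above is consistent with a two-predictor linear model. The following is a minimal sketch of such a fit, not the report's actual code: the formula is inferred from the coefficient names, and `toy_data`, `toy_lm`, `toy_coef`, and `toy_df` are hypothetical names for simulated stand-in data, since the report's `model_data` is not reproduced here.

```r
# Simulated stand-in data: NOT the report's data, only the same column names.
set.seed(42)
n <- 500
toy_data <- data.frame(
  hh_vmt       = rnorm(n, mean = 41, sd = 8),
  vote_rep_pct = runif(n, min = 2, max = 82)
)

# Generate a response from known coefficients (chosen to echo the table above).
toy_data$vote_i0732n_pct <- 28.86 + 0.16 * toy_data$hh_vmt +
  0.61 * toy_data$vote_rep_pct + rnorm(n, sd = 3)

# Assumed formula, inferred from the coefficient names in the output.
toy_lm <- lm(vote_i0732n_pct ~ hh_vmt + vote_rep_pct, data = toy_data)

toy_coef <- coef(toy_lm)        # estimated intercept and slopes
toy_df   <- df.residual(toy_lm) # rows minus 3 estimated coefficients
print(round(toy_coef, 2))
```

Applying the same degrees-of-freedom rule to the report's 1431 rows gives 1431 - 3 = 1428, which matches the `t(1428)` column header.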
Linear model assumption checks:
Spatial Autocorrelation check (Moran I test):
Moran I test under randomisation
data: residuals(model_lm)
weights: model_spatial_weights
n reduced by no-neighbour observations
Moran I statistic standard deviate = 29.982, p-value < 2.2e-16
alternative hypothesis: greater
sample estimates:
Moran I statistic Expectation Variance
0.4884553310 -0.0007007708 0.0002661758
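The reported standard deviate can be reproduced from the three sample estimates. This is a small arithmetic check using only the numbers printed above. The variable names are illustrative, and the formula for the expectation, E[I] = -1 / (n - 1), is the standard one under randomisation.

```r
# Values copied from the Moran I test output above.
moran_i     <- 0.4884553310
expectation <- -0.0007007708
variance    <- 0.0002661758

# Standard deviate: (observed - expected) / sqrt(variance).
moran_z <- (moran_i - expectation) / sqrt(variance)

# Since E[I] = -1 / (n - 1), the number of observations used can be backed out.
n_used <- 1 - 1 / expectation

# One-sided p-value for the alternative hypothesis "greater".
p_value <- pnorm(moran_z, lower.tail = FALSE)

print(round(moran_z, 3))
print(round(n_used))
```

The backed-out count is three fewer than the 1431 rows in the model data, in line with the "n reduced by no-neighbour observations" message and the three no-neighbour regions listed in the spatial lag output.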
2.3.2 Spatially Lagged Regression
Call:
lagsarlm(formula = model_lm, data = model_data, listw = model_spatial_weights,
zero.policy = TRUE)
Residuals:
Min 1Q Median 3Q Max
-19.7908 -1.6062 0.1467 1.8119 18.9670
Type: lag
Regions with no neighbours included:
523 1102 1253
Coefficients: (asymptotic standard errors)
Estimate Std. Error z value Pr(>|z|)
(Intercept) 19.340830 0.681397 28.384 < 2.2e-16
hh_vmt 0.151437 0.010106 14.984 < 2.2e-16
vote_rep_pct 0.454746 0.010966 41.468 < 2.2e-16
Rho: 0.27185, LR test value: 276.53, p-value: < 2.22e-16
Asymptotic standard error: 0.016331
z-value: 16.646, p-value: < 2.22e-16
Wald statistic: 277.09, p-value: < 2.22e-16
Log likelihood: -3504.669 for lag model
ML residual variance (sigma squared): 7.7375, (sigma: 2.7816)
Nagelkerke pseudo-R-squared: 0.93831
Number of observations: 1431
Number of parameters estimated: 5
AIC: 7019.3, (AIC for lm: 7293.9)
LM test for residual autocorrelation
test value: 405.56, p-value: < 2.22e-16
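Several figures in this summary are tied together by simple identities, which can be checked from the printed values alone. The sketch below is illustrative rather than part of the original analysis. It assumes the "AIC for lm" counts four parameters (three coefficients plus the residual variance), and small discrepancies are expected because the inputs are rounded.

```r
# Values copied from the lagsarlm() summary above.
loglik_lag <- -3504.669
k_lag      <- 5        # "Number of parameters estimated"
aic_lm     <- 7293.9   # "AIC for lm"
rho        <- 0.27185
rho_se     <- 0.016331

# AIC = 2k - 2 * logLik for the lag model.
aic_lag <- 2 * k_lag - 2 * loglik_lag

# Assumption: the lm AIC counts 4 parameters (3 coefficients + residual variance).
k_lm      <- 4
loglik_lm <- (2 * k_lm - aic_lm) / 2

# Likelihood-ratio statistic for the null hypothesis rho = 0.
lr_stat <- 2 * (loglik_lag - loglik_lm)

# Wald z-value and 95% confidence interval for rho.
rho_z  <- rho / rho_se
rho_ci <- rho + c(-1, 1) * qnorm(0.975) * rho_se

print(round(c(aic_lag = aic_lag, lr_stat = lr_stat, rho_z = rho_z), 2))
print(round(rho_ci, 2))
```

The recomputed values agree with the summary to within rounding, and the interval for rho matches the one reported in the parameter comparison.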
Parameter comparison: OLS vs Spatial Lag
Parameter | model_lm | model_spatial_lag
----------------------------------------------------------
(Intercept) | 28.86 (28.05, 29.68) | 19.34 (18.01, 20.68)
hh vmt | 0.16 ( 0.14, 0.18) | 0.15 ( 0.13, 0.17)
vote rep pct | 0.61 ( 0.60, 0.63) | 0.45 ( 0.43, 0.48)
rho | | 0.27 ( 0.24, 0.30)
----------------------------------------------------------
Observations | 1431 |
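The two columns are not directly comparable, because the spatial lag coefficients describe effects before spatial feedback. As a sketch in standard notation (the symbols $y$, $X$, $W$, $\beta$, $\rho$, and $\varepsilon$ are introduced here for illustration and are not defined elsewhere in this report), the spatial lag model and its reduced form are:

$$
y = \rho W y + X\beta + \varepsilon
\quad\Longleftrightarrow\quad
y = (I - \rho W)^{-1}\,(X\beta + \varepsilon)
$$

If the weights matrix $W$ is row-standardised (an assumption, since the weights style is not shown), the average total effect of predictor $k$ is approximately $\beta_k / (1 - \rho)$. Using the estimates above:

$$
\frac{0.4547}{1 - 0.2719} \approx 0.62 \;\; (\text{vote\_rep\_pct}),
\qquad
\frac{0.1514}{1 - 0.2719} \approx 0.21 \;\; (\text{hh\_vmt})
$$

Under that assumption, the total effect for vote_rep_pct is close to the OLS estimate of 0.61. The figures are approximate because the three regions with no neighbours contribute rows of zeros to $W$.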
Comparison of Adjusted R2 (OLS) vs Nagelkerke Pseudo-R2 (Spatial Lag)
# A tibble: 1 × 2
lm spatial_lag
<dbl> <dbl>
1 0.925 0.938
Spatially lagged regression model residuals:
2.4 Methodology Notes
- Income should not be included in our regression because it is already an input to the model that estimates household VMT (see LATCH Methodology, p. 10), so part of its effect is already carried by hh_vmt.

